How to Generate Music using a LSTM Neural Network in Keras
クローン
code:sh
git clone git@github.com:Skuldur/Classical-Piano-Composer.git
環境構築
code:sh
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install music21 keras tensorflow h5py
code:py
import glob
from music21 import converter, instrument, note, chord
code:py
import glob
notes = []
for file in glob.glob("midi_songs/*.mid"):
midi = converter.parse(file)
notes_to_parse = None
parts = instrument.partitionByInstrument(midi)
if parts: # file has instrument parts
notes_to_parse = parts.parts0.recurse() else: # file has notes in a flat structure
notes_to_parse = midi.flat.notes
for element in notes_to_parse:
if isinstance(element, note.Note):
notes.append(str(element.pitch))
elif isinstance(element, chord.Chord):
notes.append('.'.join(str(n) for n in element.normalOrder))
code:py
from keras.utils import np_utils
n_vocab = len(set(notes))
sequence_length = 100
# get all pitch names
pitchnames = sorted(set(item for item in notes))
# create a dictionary to map pitches to integers
note_to_int = dict((note, number) for number, note in enumerate(pitchnames))
network_input = []
network_output = []
# create input sequences and the corresponding outputs
for i in range(0, len(notes) - sequence_length, 1):
network_input.append([note_to_intchar for char in sequence_in]) n_patterns = len(network_input)
# reshape the input int a format compatible with LSTM layers
network_input = numpy.reshape(network_input, (n_patterns, sequence_length, 1))
# normalize input
network_input = network_input / float(n_vocab)
network_output = np_utils.to_categorical(network_output)
Model
Finally we get to designing the model architecture. In our model we use four different types of layers:
LSTM layers is a Recurrent Neural Net layer that takes a sequence as an input and can return either sequences (return_sequences=True) or a matrix.
Dropout layers are a regularisation technique that consists of setting a fraction of input units to 0 at each update during the training to prevent overfitting. The fraction is determined by the parameter used with the layer.
Dense layers or fully connected layers is a fully connected neural network layer where each input node is connected to each output node.
The Activation layer determines what activation function our neural network will use to calculate the output of a node.
code:py
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import LSTM
from keras.layers import Activation
model = Sequential()
model.add(LSTM(
256,
input_shape=(network_input.shape1, network_input.shape2), return_sequences=True
))
model.add(Dropout(0.3))
model.add(LSTM(512, return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(256))
model.add(Dense(256))
model.add(Dropout(0.3))
model.add(Dense(n_vocab))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
For this tutorial we will use a simple network consisting of three LSTM layers, three Dropout layers, two Dense layers and one activation layer. I would recommend playing around with the structure of the network to see if you can improve the quality of the predictions.
code:py
from keras.callbacks import ModelCheckpoint
filepath = "weights-improvement-{epoch:02d}-{loss:.4f}-bigger.hdf5"
checkpoint = ModelCheckpoint(
filepath, monitor='loss',
verbose=0,
save_best_only=True,
mode='min'
)
model.fit(network_input, network_output, epochs=200, batch_size=64, callbacks=callbacks_list)
To make sure that we can stop the training at any point in time without losing all of our hard work, we will use model checkpoints. Model checkpoints provide us with a way to save the weights of the network nodes to a file after every epoch. This allows us to stop running the neural network once we are satisfied with the loss value without having to worry about losing the weights. Otherwise we would have to wait until the network has finished going through all 200 epochs before we could get the chance to save the weights to a file.
code:py
model = Sequential()
model.add(LSTM(
512, # 256?
input_shape=(network_input.shape1, network_input.shape2), return_sequences=True
))
model.add(Dropout(0.3))
model.add(LSTM(512, return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(512))
model.add(Dense(256))
model.add(Dropout(0.3))
model.add(Dense(n_vocab))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')
# Load the weights to each node
model.load_weights('weights.hdf5')